AITopics | self-interpretable model

1387a00f03b4b423e63127b08c261bdc-Paper.pdf

Neural Information Processing SystemsApr-24-2026, 19:13:05 GMT

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
(3 more...)

Add feedback

Self-InterpretableModelwithTransformation EquivariantInterpretation

Neural Information Processing SystemsFeb-7-2026, 13:55:11 GMT

Withthe proliferation ofmachine learning applications inthe real world, the demand for explaining machine learning predictions continues to grow especially in high-stakes fields. Recent studies havefound that interpretation methods can be sensitive and unreliable, where the interpretations can be disturbed by perturbations or transformations of input data. To address this issue, we propose to learn robust interpretations through transformation equivariant regularization in a self-interpretable model.

artificial intelligence, interpretation, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

Self-Interpretable Model with Transformation Equivariant Interpretation

Neural Information Processing SystemsDec-23-2025, 19:08:35 GMT

With the proliferation of machine learning applications in the real world, the demand for explaining machine learning predictions continues to grow especially in high-stakes fields. Recent studies have found that interpretation methods can be sensitive and unreliable, where the interpretations can be disturbed by perturbations or transformations of input data. To address this issue, we propose to learn robust interpretation through transformation equivariant regularization in a self-interpretable model. The resulting model is capable of capturing valid interpretation that is equivariant to geometric transformations. Moreover, since our model is self-interpretable, it enables faithful interpretations that reflect the true predictive mechanism. Unlike existing self-interpretable models, which usually sacrifice expressive power for the sake of interpretation quality, our model preserves the high expressive capability comparable to the state-of-the-art deep learning models in complex tasks, while providing visualizable and faithful high-quality interpretation. We compare with various related methods and validate the interpretation quality and consistency of our model.

interpretation, self-interpretable model, transformation equivariant interpretation, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Add feedback

Appendix for " Self-Interpretable Model with Transformation Equivariant Interpretation "

Neural Information Processing SystemsOct-2-2025, 11:22:01 GMT

Please refer to the Appendix 5 for details. Here we only present correctly classified examples. Here the first rows are the untransformed and the transformed images, while the second rows are the corresponding interpretations. This is a supplement to Figure 1 in the main body of the paper . To illustrate the comparison results consistently, we use heatmaps of the same settings to visualize the interpretations of all methods.

dataset, heatmap, interpretation, (14 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.69)

Add feedback

Self-Interpretable Model with Transformation Equivariant Interpretation

Neural Information Processing SystemsOct-9-2024, 14:02:09 GMT

With the proliferation of machine learning applications in the real world, the demand for explaining machine learning predictions continues to grow especially in high-stakes fields. Recent studies have found that interpretation methods can be sensitive and unreliable, where the interpretations can be disturbed by perturbations or transformations of input data. To address this issue, we propose to learn robust interpretation through transformation equivariant regularization in a self-interpretable model. The resulting model is capable of capturing valid interpretation that is equivariant to geometric transformations. Moreover, since our model is self-interpretable, it enables faithful interpretations that reflect the true predictive mechanism.

interpretation, self-interpretable model, transformation equivariant interpretation, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation

Wu, Yinjun, Keoliya, Mayank, Chen, Kan, Velingker, Neelay, Li, Ziyang, Getzen, Emily J, Long, Qi, Naik, Mayur, Parikh, Ravi B, Wong, Eric

arXiv.org Artificial IntelligenceJun-2-2024

Designing faithful yet accurate AI models is challenging, particularly in the field of individual treatment effect estimation (ITE). ITE prediction models deployed in critical settings such as healthcare should ideally be (i) accurate, and (ii) provide faithful explanations. However, current solutions are inadequate: state-of-the-art black-box models do not supply explanations, post-hoc explainers for black-box models lack faithfulness guarantees, and self-interpretable models greatly compromise accuracy. To address these issues, we propose DISCRET, a self-interpretable ITE framework that synthesizes faithful, rule-based explanations for each sample. A key insight behind DISCRET is that explanations can serve dually as database queries to identify similar subgroups of samples. We provide a novel RL algorithm to efficiently synthesize these explanations from a large search space. We evaluate DISCRET on diverse tasks involving tabular, image, and text data. DISCRET outperforms the best self-interpretable models and has accuracy comparable to the best black-box models while providing faithful explanations. DISCRET is available at https://github.com/wuyinjun-1993/DISCRET-ICML2024.

dataset, discret, explanation, (15 more...)

arXiv.org Artificial Intelligence

2406.00611

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Europe > Austria > Vienna (0.14)
Africa > Uganda (0.06)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

BET: Explaining Deep Reinforcement Learning through The Error-Prone Decisions

Liu, Xiao, Zhao, Jie, Chen, Wubing, Tan, Mao, Su, Yongxing

arXiv.org Artificial IntelligenceJan-14-2024

Despite the impressive capabilities of Deep Reinforcement Learning (DRL) agents in many challenging scenarios, their black-box decision-making process significantly limits their deployment in safety-sensitive domains. Several previous self-interpretable works focus on revealing the critical states of the agent's decision. However, they cannot pinpoint the error-prone states. To address this issue, we propose a novel self-interpretable structure, named Backbone Extract Tree (BET), to better explain the agent's behavior by identify the error-prone states. At a high level, BET hypothesizes that states in which the agent consistently executes uniform decisions exhibit a reduced propensity for errors. To effectively model this phenomenon, BET expresses these states within neighborhoods, each defined by a curated set of representative states. Therefore, states positioned at a greater distance from these representative benchmarks are more prone to error. We evaluate BET in various popular RL environments and show its superiority over existing self-interpretable models in terms of explanation fidelity. Furthermore, we demonstrate a use case for providing explanations for the agents in StarCraft II, a sophisticated multi-agent cooperative game. To the best of our knowledge, we are the first to explain such a complex scenarios using a fully transparent structure.

agent, bone, self-interpretable model, (15 more...)

arXiv.org Artificial Intelligence

2401.07263

Country: